Khiops: A Statistical Discretization Method of Continuous Attributes
Similar resources
Khiops: A Discretization Method of Continuous Attributes with Guaranteed Resistance to Noise
In supervised machine learning, some algorithms are restricted to discrete data and need to discretize continuous attributes. The Khiops discretization method, based on chi-square statistics, optimizes the chi-square criterion in a global manner on the whole discretization domain. In this paper, we propose a major evolution of the Khiops algorithm that provides guarantees against overfitting ...
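As a rough illustration of the chi-square criterion mentioned in this abstract, the sketch below computes the Pearson chi-square statistic of a contingency table whose rows are candidate intervals and whose columns are class labels. It is not the Khiops algorithm itself; the function name `chi_square` and the example counts are assumptions made for illustration only.

```python
def chi_square(table):
    """Pearson chi-square statistic of a contingency table.

    Rows are intervals of a discretization, columns are class labels
    (an assumed layout, chosen for illustration).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if interval and class were independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical class counts for two intervals and two classes.
counts = [[20, 5], [5, 20]]
print(chi_square(counts))
```

A larger statistic indicates that the intervals separate the classes more strongly than independence would predict; a table with identical class proportions in every interval gives a statistic of zero.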
Discretization of Continuous Attributes
In the data-mining field, many learning methods — such as association rules, Bayesian networks, and induction rules (Grzymala-Busse & Stefanowski, 2001) — can handle only discrete attributes. Therefore, before the machine-learning process, it is necessary to re-encode each continuous attribute in a discrete attribute constituted by a set of intervals. For example, the age attribute can be trans...
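The re-encoding step described in this abstract can be sketched as a lookup of a value against sorted cut points. This is a minimal sketch under stated assumptions: the cut points, the interval labels, and the helper name `discretize` are hypothetical and do not come from the cited paper.

```python
from bisect import bisect_right


def discretize(value, cut_points):
    """Return the index of the interval that contains value.

    cut_points must be sorted in ascending order; interval k covers
    values from cut_points[k - 1] (inclusive) up to cut_points[k] (exclusive).
    """
    return bisect_right(cut_points, value)


# Hypothetical cut points and labels for an age attribute.
cut_points = [18, 40, 65]
labels = ["under 18", "18 to 39", "40 to 64", "65 and over"]

for age in (17, 18, 52, 70):
    print(age, labels[discretize(age, cut_points)])
```

Supervised methods differ mainly in how they choose the cut points; once the cut points are fixed, the encoding itself is this kind of interval lookup.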
Dynamic Discretization of Continuous Attributes
Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to deal with continuous attributes, which greatly increase learning times. This paper presents a new method of discretization, whose main char...
Compression-Based Discretization of Continuous Attributes
Discretization of continuous attributes into ordered discrete attributes can be beneficial even for propositional induction algorithms that are capable of handling continuous attributes directly. Benefits include possibly large improvements in induction time, smaller sizes of induced trees or rule sets, and even improved predictive accuracy. We define a global evaluation measure for discretization...
On Exploring Soft Discretization of Continuous Attributes
Searching for a binary partition of attribute domains is an important task in data mining. It is present in both decision tree construction and discretization. The most important advantages of decision tree methods are compactness and clearness of knowledge representation as well as high accuracy of classification. Decision tree algorithms also have some drawbacks. In cases of large data tables...
Journal
Journal title: Machine Learning
Year: 2004
ISSN: 0885-6125
DOI: 10.1023/b:mach.0000019804.29836.05